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ABSTRACT 

The frequencies of solar oscillations are known to change with solar activ- 
ity. We use Principal Component Analysis to examine these changes with high 
precision. In addition to the well-documented changes in solar normal mode os- 
cillations with activity as a function of frequency, which originate in the surface 
layers of the Sun, we find a small but statistically significant change in frequen- 
cies with an origin at and below the base of the convection zone. We find that at 
r = (O.712+°:°°^^)i?0, the change in sound speed is 5c^/c^ = (7.23 ± 2.08) x lO^^ 
between high and low activity. This change is very tightly correlated with solar 
activity. In addition, we use the splitting coefficients to examine the latitudi- 
nal structure of these changes. We find changes in sound speed correlated with 
surface activity for r > 0.9Rq. 

Subject headings: Sun: helioseismology. Sun: activity. Sun: interior 



1. INTRODUCTION 



Normal modes of oscillation of the Sun have provided a powerful tool to peer into 
the solar interior. In particular, modern experiments, both ground- and space-based, have 
measured the intermediate degree global oscillation spectrum with high precision since the 
beginning of solar cycle 23. Acc urate determinations of inter ior structure and dynamics are 
now possible (see, e.g., review by lChristensen-Dalsgaardll2002l ). These measurements contain 
a wealth of information about the fundamental causes of solar variability. 

It is generally believed that t he seat of the solar dynamo is located at the base of 
the convection zone (e.g., review by ICharbonnead l2005l ) . Because helioseismology provides 
the only direct measurements of this region of the solar interior, these results can play 
an important role in constraining dynamo theories. In particular, a number of authors have 
attempted to use global and local helioseismic techniques to determine hmits on the strength 
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of the magnetic field at the base of the convection zone (e.g.. lChou et al.ll2003l . and references 
therein). In this paper, we attempt to improve hehoseismic measurements of changes in this 
region. 

Global modes of solar oscillation are described by three numbers that characterize the 
spherical harmonics that are used to define the horizontal structure of the mode. These are 
(1) radial order n that related to the number of nodes in the radial direction, (2) the degree 
£ that is related to the horizontal wavelength of the mode, and (3) the azimuthal order m 
that defines the number of nodes along the equator. In a spherically symmetric star, the 
2£ + 1 modes of an (n, €) multiplet are degenerate, but effects that break spherical symmetry 
such as magnetic fields or rotation lift the degeneracy and results in frequency splittings. 
The frequencies Vntm of the modes within a multiplet can be expressed as an expansion in 
orthogonal polynomials: 



Jmax 



+ ^a,(n,£)P] 



(1) 



Early investigators (e.g., Duvall et al. IQsd ) commonly used Legendre polynomials, whereas 
now one often uses the R itzwoUer-Lavely formulation of the Clebsch-Gordan expansion 
(IRitzwoUer fc Lavelylll99ll ) where the basis functions are polynomials related to the Clebsch- 
Gordan coefficients. In either case, the coefficients aj are referred to as a-coefficients or split- 
ting coefficients. Solar structure is determined by inverting the mean freque ncy Vne., while 
the o dd-order coefficients 01,03, . . . depend principally on the rotation rate (IDurney et al. 



19881 ) and reflect the advective, latitudinally symmetric part of the perturbations caused 
by rotation. Hence, these are used to determine the rate of rotation inside the Sun. The 
even order a coefficients on the other hand result from magneti c fields and asphericities in 
solar structure, and the sec ond order effects of rotation (e.g., iGough fc ThompsonI Il990l : 
Dziembowski fc Goodelll99ll ). 



Solar oscillation frequencies are kno wn to vary on tirn e scales related to the solar ac- 
tivity cycle. This was first s ugge sted bv IWoodard fc Noved (Il985[) and confirmed soon af- 
ter by lElsworth et"aD Jl99oh and llibbrecht fc WoodardI Jl99of )~ It w as quickly estab lished 



that the frequency shifts were strongly correlated with surface activ ity (IWoodard et al. 



1991 



Bachmann fc Brownlll993l : lElsworth et al.lll994l : lRegulo et al.lll994 etc.). iLibbrecht fc Woodard 
Jl990h observed that the frequency shifts depended very strongly on mo de fr equency and 
very w eakly on degree £ of the mode, and lAnguera Gubau et al.l (119921 ) and lElsworth et al. 



(Il994j ) confirmed these results. These authors concluded that all or most of the physical 
changes responsible for the changes in frequency were confined to the shallow layers of the 
Sun. In general, this picture has been confirmed in more recent studies (e.g., observational re- 
sults: iHowe et al.lll999l . l2002l : [Basu fc Antialboool : Iveriier et al.ll20ol : IPziembowski fc Goode 
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2005 . etc., and theoretical results: iGoldreich et al.l 119911 : iBalmforth et al.l 119961 : iLi et al. 
2003 |, etc.). A change in the second helium ionization zone at r = 0.98-Rf;i, first suggested 



by IGoldreich et al. (119911) and iGoughl (120021 ) . has been confirmed bv lBasu fc Mandell (120041 ) 
and IVerner et al.l (120061 ) . 



The even-order mode splitti ng parameters sample effects of structural asphericities on 
the mode frequency. iKuhnI (119881 ) suggested that they were correlated with observed changes 
in surface temperature. Subsequent work has shown that the aspherical components of the 



mode frequencies are tightly correlated with surface magnetic activity (IHowe et al.l Il999 



Antia et al.l 1200 ll ). This high correlation lends further credence to the idea that frequency 
shits are caused by surface and/or near-surface effects. This can be tested directly with 
high degree modes that sample the near-surface layers of the Sun. However, as the degree / 
increases, globa l modes becorae increasingly hard to measure precisely due to the decre ase in 
mode lifetimes ( iRhodes et al.lll998l : iRabello-Soares et al.ll200ll : iKorzennik et al.ll2004j ). The 
lack of reliable measurements of these modes has led some authors to use ring diagrams to 
measure high degree modes and measure changes in the shallow layers of the solar convection 
zone. These stud ies have confirme d that structural changes do occur in the near-surface 
layers of the Sun (IBasu et al.l 120071 ). 



Direct inversions of changes in the structure of the solar interior probed by the spheri- 
cally symmetric global modes have n ot yielded any measurable differences in the deep interior 
( lBasull2002l : lEff-Darwich et al.ll2002l) and there have been up p er limits placed on the changes 
at t he base of the convectio n zone (|Eff-Darwich et al.ll2002l ). IChou fc SerebryanskiyI (|2005l ) 
and ISerebryanskiy fc Ghoul (120051 ) have presented evidence of a possible change in mode 
frequencies with lower turning points near the base of the convection zone. 

Internal dynamics, on the other hand, show clear and unequivocal evidence of change 
over the course of the solar cycle. In the convection zone, bands of different rotational 
velocities (called zonal flows) h ave been shown to migrate poleward at high l atitudes an d 
equator- ward at low latitudes (ISchoul Il999l : iHowe et al.l l2000l : lAntia &: Basul l2000l . l200ll ). 
Temporal variations in dynam ics have been shown to penetrate throughout the entire con- 



vecti on zone, and even below (IVorontsov et al.ll2002l : iBasu fc Antiall2006l : iHowe et al.l 12005 
2006h . 



One conclusion that can be drawn unequivocally from previous studies of the changes in 
solar structure is that any changes deeper than those in the outermost layers of the Sun are 
very small, and hence very d i fficult to detect through their signatures in oscillation frequen- 



cies. 



Ghou fc SerebryanskiyI (120051 ) used a smoothing technique to attempt to remove the 



effect of surface variations, and found that the scaled frequency differences showed evidence 
of change near the base of the convection zone, but could not say more about the physi- 
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cal nature of the changes. Attempts to inver t the frequencies directly have neve r shown any 
changes larger than the inversion errors (e.g.. lBasijl2002l : lEff-Darwich et al.ll2002l ). Therefore, 
although current helioseismic instruments have determined the solar oscillation frequencies 
with tremendous precision, statistical errors in those frequencies are still too large to make 
any direct detections of structure changes in the deep interior. 

In this work, we take a somewhat different approach to attempting to detect changes 
at the base of the convection. We use Principal Component Analysis (PCA) to separate 
the frequency differences over the last solar cycle into a linear combination of different time- 
dependent components. This has the effect of decreasing the effects of measurement errors 
in the measured helioseismic frequencies, which allows us to isolate as precisely as possible 
the changes in frequency over time. In section we describe the data used, and the methods 
employed to analyze them. In section [31 we present the results in detail. Finally, in section 
m we discuss the significance of the results and presents our conclusions. 



DATA & ANALYSIS 



2.1. Data 



For this work, we use helioseismic global-mode data sets from two different projects, 
one from the Michelson Doppler Imager (MDI) on-board the SOHO spacecraft, and the 
other from the Global Oscillations Network Group (GONG). Th e MDI mode sets consist of 
frequencies and splittings obtained from 72-day long time series flSchoulll999l ). We use 54 of 
these sets, spanning the period from 1996 May 1 to 2007 May 16. The GONG mode sets 
are derived from 108-day time series (IHill et al.l 119961 ). Although GONG provides sets that 
overlap in time, we only use non-overlapping sets in the present work. We use 40 sets from 
the period 1995 May 7 to 2007 April 14. Because the two sets are from completely different 
instruments and independent data reduction pipelines, any real solar signatures should show 
up in both sets. The /-modes do not sample the deep interior and are dominated by surface 
effects, so we exclude them from our study. The n = 1 modes have larger errors than the 
higher order modes, and so we exclude them as well. The included modes are low and 
intermediate degree modes up to £ = 176, with order n from 2 to 16. 

As a proxy for total solar activity, we use the 10.7cm radio flux measurements taken 
by the Dominion Radio Astrophysical Observatory (DRAO) 0. This measurement has been 



Data can be found at 



http://www.drao-ofr.hia-iha.nrc-cnrc.gc.ca/icarus/www/sol_honie.shtnil 
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found to be very tightly correlated with solar activity (e.g., iTappind 119871 ). We average 
the -Fio.7 measurements over 72-day periods for comparison with MDI data, and 108-day 
periods for comparison with GONG data. For latitudinal structure in surface activity, we 
use the surface magnetic field, taken from SOHO/MDI synoptic maps §. The magnetic 
field strengths are averaged over the same 72-day periods as the -F10.7 data, and over the 
appropriate ranges in latitude. 



2.2. Method 



We use Principal Component Analysis to describe the temporal variations of the weighted 
frequencies as a small number of uncorrelated basis functions. The use of PGA is a com- 
mon techniqu e in multivariate data a nalysis to reduce dimensionality and expose underlying 



variables (see iMurtagh fc Heckl 119871 . Ghapter 2, for a discussion of its astrophysical appli- 



cations). A brief description of the method and discussion of its limitations is included in 
the Appendix. PGA is a technique whereby a set of observations is expressed as a set of 
uncorrelated vectors. The usefulness of the technique arises from the fact that variation of 
the data about the first vector is maximal, and about the second vector, maximal subject 
to the constraint that it be orthogonal to the first vector, and so on. In other words, PGA 
provides a very efficient linear representation of a data set, and it is able to substantially 
reduce the dimensionality of the data set without losing any significant information. 

It has been known for many years that the frequencies of the solar global modes of 
oscillation change with the solar activity cycle. With the arrival of high quality mea- 
surements of intermediate degree modes, it is clear that the amount of frequency shift 
over the solar cycle is dependent on the mode. Each mode has an associated mode in- 
ertia Eni- Frequency differences can be scaled by the quantity Qne = Ene/ Eo^Uni), where 
Evtivne) is the inertia of the £ = modes in terpolated to the frequency of the {n, i) mode 



(iGhristensen-Dalsgaard fc Berthomieul Il99ll ). This scaling accounts for the fact that the 
frequencies of modes with lower inertia are changed by a larger amount than modes with a 
higher inertia by the same underlying perturbation. When scaled in this way, the degree- 
dependence of the frequencies over the solar cycle largel y vanish, and the frequency chan ges 



become slowly varying functions of frequency only (e.g., iGhaplin et al.l 12001 



uency cnan g 
Basull2002h . 



Our data points are the scaled frequency differences Qni^^ni/^ne- For the MDI obser- 
vations, there are 54 total mode sets in our work, which, when one is removed to be used 



^MDI synoptic maps of Carrington rotations can be found at 
http://soi.stanford.edu/niagnetic/index6.htmll 
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as the base set, giving us 53 sets of scaled frequency differences. Tliere are 40 mode sets in 
our GONG data set, or 39 difference sets. Because the PGA method requires a complete 
covariance matrix (see the Appendix), we can only include a mode if it is present in all 
mode sets. This dramatically reduces the amount of usable information, particularly since 
many excluded modes are missing in only one or two sets out of 53. For these modes, it is 
possible to interpolate a value for the missing frequency differences. In this case, because 
most of the frequency differences for the mode in question will be actually observed, any 
errors introduced through the interpolation will have a negligible effect on the PGA results. 
We tested two interpolation methods — one a spline interpolation along ridges in the l-v 
diagram (interpolated from modes with the same radial order n), and the other a linear 
interpolation between neighboring modes in time. When tested against existing modes, the 
interpolation along time proved superior, reproducing the actual data to better than a factor 
of 1.2(7, while the interpolation along the ridge results had a 2a distribution. Therefore, only 
results using the time interpolation are discussed in this paper, but the PGA results using 
interpolation over degree were entirely consistent. Finally, to ensure that the PGA is robust, 
Monte Garlo simulations were performed to ensure that the exclusion of certain modes would 
not affect the results. The PGA analysis of these data appears to be very robust. Errors in 
the PGA components were computed by means of a Monte Garlo simulation. 

In addition to the mean frequencies z/„^, which contain information about the spherically 
symmetric part of the solar interior, we have the even-order splitting coefficients a2j{n,i), 
which allow us to reproduce mode frequencies as a function of latitude. Latitudinal frequen- 
cies as a function of colatitude 6 can be obtained as follows: 

Vnl{0) = Vni+y -p. P2fc(cOS6'), (2) 

k 

where Qik are the angular integrals given by 

r d<P r sm9d9Yr{Yn*P2k{cose) = -Q.^vfUim), (3) 
Jo Jo 

and P2fc(cos 9) are Legendre polynomials of degree 2fc, and the "Pgfe the same polynomials 
as in equation [H 

Ultimately, we are interested in helioseismic data for what they can tell us about the solar 
interior. To extract this information, we invert these data sets for the parameters of interest. 
We use two different inversion techniques, and invert for the change in sound speed relative 
to the comparis on frequency. The first te chnique used is Subtractively Optimized Local 



Averages (SOLA lPijpers &: Thompsonlll994j ). A descripti on of the implementation use d here 



and how to select inversion parameters can be found in iRabello-Soares et al.l (119991 ) . The 
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second inversion technique is that of Regularized Least Squares (RLS). The i mplementation 
used h ere and the selection of invers ion parameters have been discussed in lAntia &: Basu 
( 119941 ) and lBasu fc Thompson! (Il996l ). The use of the se two tech niques in tandem is crucial 
because the techniques are complimentary in nature (ISekiil 119971 ) . Inversions can be trusted 
if both inversion techniques return the same results. 

The quantities which we invert are appropriately scaled eigenvectors from the PCA of 
the frequency differences. The differences are taken relative to some fiducial set, usually 
the first mode set (corresponding to activity minimum at the beginning of solar cycle 23). 
The eigenvectors, ^j, are normalized so that ■ = 1. Each data set is the vector x(t) 
that contains the individual mode frequencies QniSune/^ne- These data can be represented 
as a linear combination of the the eigenvectors with coefficients given by Ci(t) = C,ijXj{t) 
(see the Appendix). The scaled eigenvector Ci(t)^^, therefore, has a physically reasonable 
magnitude, and in all following cases, it is this quantity that we invert. In general, we choose 
the coefficient with the largest magnitude, which represents the largest variation in sound 
speed. This coefficient is usually the one associated with the set at maximum activity. 



3. RESULTS & DISCUSSION 

3.1. Mean frequencies 

We have performed the PCA decomposition of the MDI mean frequencies with respect 
to the first set (set #1216, start day 1996 May 1, end day 1996 July 12). This is a low 
activity set. The first four eigenvectors — ^4 are shown in Fig. [H both as a function of 
frequency and of the lower turning point of the modes. The scaling coefficients for the first 
four eigenvectors are presented in Fig. [2l Also shown in the figure is the difference in the 
10.7 cm radio flux between the first set and the subsequent sets. 

When plotted as a function of frequency, the first eigenvector $,1 appears to be almost 
entirely due to near-surface effects — it is seems to be a slowly varying function of frequency 
only. As a function of r^, however, a change in the average level of the frequency differences 
can be seen below the base of the convection zone (around r ^ 0.713i?Q — marked by a 
vertical line in Fig. [1]). This implies a time dependent change near the base of the solar 
convection zone. We conclude, therefore, that there is a statistically significant component 
of the frequency variability picked out by that does not originate at the surface. The 
coefficients for are tightly correlated with the 10.7cm radio flux, a proxy for surface 
activity; the correlation coefficient is 0.99. A linear regression fit to the change in radio 
flux relative to solar minimum, AF10.7, gives the relation between the 10.7cm flux and the 
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coefficients of ^-^ as: 

ci = (2.53 X 10~^ ± 3.13 X 10"^)AFio.7 + 3.95 x 10"^ ± 1.6 x 10"^ (4) 

Thus, all changes, as manifest by are tightly correlated with surface magnetic activity. 

The second eigenvector, S'lso shows variability over the solar cycle. Comparing it as 
a function of u and of rt, it is clear that the structure is neither a pure function of frequency 
nor of lower turning point. The first four obvious downturn features correspond to modes of 
order n = 2, 3, 4, and 5. As a function of r^, it seems clear that the structure is concentrated 
at or near the surface. However, the differences can not be fit by the usual "surface term", 
and further examinatio n reveals that it cannot be fi t eve n by h i gher-o rder surface terms of 
the form considered in iBrodsky fc Vorontsov Jl993h and lAntial (119951 ) . The coefficients for 



^2 offer some hint as to what is going on. They exhibit no obvious solar cycle dependency, 
but rather seem to be a roughly linearly decreasing function of time. There does not seem 
to be any periodicity on a scale of eleven years or shorter. Larson & Schou (2008) have 
undertaken an in depth study of the systematics in the MDI data reduction pipeline. The 
plate scale in the MDI instrument has changed slightly over the course of SOHO's mission, 
and they have shown that the effect of the resultant error in the measured radius introduces 
errors that look exactly like the ^2 eigenvector computed in this work. We do not believe, 
therefore, that the features in ^2 cire solar in origin, but are rather artifacts from the MDI 
data reduction pipeline. As we will show below, analysis of GONG data confirms our belief. 

The third and fourth eigenvectors, £3 and ^4 do not exhibit any significant structure 
at all. The vector ^3 shows a slight trend with frequency, but the scaling coefficients C3 
are normally distributed, and we cannot identify any physical significance in this eigenvec- 
tor. The remaining eigenvectors are statistically consistent with Gaussian noise distributed 
around zero. We conclude, therefore, that the temporal variation of the MDI frequencies is 
dependent on a linear combination of $,1 and ^2 alone. In Fig. [3l we show two data sets 
reconstructed from the first two eigenvectors. This figure shows that the PGA decomposi- 
tion does indeed accurately capture the original data while significantly reducing the random 
scatter in the data. The residuals normalized by the errors are plotted, and are consistent 
with Gaussian noise, with distributions of l.lcx and 0.9a for the two cases. Having confirmed 
that the third and subsequent eigenvectors are Gaussian noise, we do not consider them fur- 
ther in this paper. This reduction in noise is important for attempting to invert the small 
signatures we are looking at here. 

The fact that the PGA is applied to a set of mode sets relative to a single base set raises 
the possibility that we are unduly infiuenced by the choice of that base set. We therefore 
repeat the PGA taking a base set from halfway up the solar cycle: MDI set #2224 (start 
date: 1999 February 3, end date 1999 April 16, and an activity level during the 72 day period 
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of -Fio.7 = 130.7 SFU). The eigenvectors are consistent with those obtained from the base 
#1216 set insofar as their inner products are close to unity. We conclude, therefore, that 
the PCA results are not unduly influenced by the choice of base set. 

The mode parameters measured from GONG data are somewhat noisier than those 
measured from MDI data. Nevertheless, a similar analysis of the GONG data allows us 
to confirm results obtained from MDI data. The first two eigenvectors from the GONG 
observations are shown in Fig. m As with the MDI data, plotted against shows some 
structure in the deep interior. The second eigenvector, shows no more structure than the 
MDI ^3 eigenvector, and the remaining eigenvectors appear to be Gaussian noise, reinforcing 
our conclusions that the structure in ^2 from the MDI data set is instrumental in origin. 

We invert the appropriately scaled eigenvectors to determine the change in the sound 
speed as a function of radius. We show the results of the inversion of the vector with 
the #1216 base set in Fig. [51 This vector has been scaled by the coefficient for set #3160 
(start date: 2001 August 27) in order to give the inversion results physical meaning. It is 
readily apparent that at a depth of r «i O.713i?0, near the location of the convection zone 
base, there is a change in the sound speed. This feature is well matched in both the RLS 
and the SOLA inversions, which implies that the feature is actually present in the data. 
The depression in sound speed at the base of the convection zone with increasing activity 
is matched with a corresponding enhancement below the convection zone. We invert the 
GONG data as well. The inversion results are shown in Fig. O There is a clear feature 
at the base of the convection zone, as seen with MDI data. The presence of this feature 
in the data of an independent instrument with an independent reduction pipeline is very 
encouraging — it strongly implies that the changes implied by the inversions are present in 
the Sun itself rather than artifacts in the data. Figure [5] shows the difference in sound speed 
between two extrema in the solar cycle. To show how the interior changes with time, in Fig. 
iniwe show the sound speed inversions at three radii as a function of AF10.7. 

We confirm, therefore, that the signature in mode frequencies is consistent with a change 
in structure at the base of the convection zone. Other authors have looked for changes in 
this region, but have not found any changes. The change that we have detected, while 
statistically significant, is very small, and it is only with the benefit of an enti re solar cycle's 



worth of high precision observations that we can detect changes at this level. iBasu fc Antia 



( I2OOII ) examined the mode frequencies for evidence of a change in the location of the base 
of the convection zone. They did not detect any change, and the sound speed profile that 
we find in Fig. is very different from the one they expected from a change in the base of 
the convection zone. This implies that, even if the change we are detecting is thermal in 
nature, it is unlikely to be related in any way to a change in the position of the base of the 
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convection zone. 



These inversions have been done assuming that the frequency differences are a result 
of a change in sound speed only. It is almost certain, however, given how tightly correlated 
this change is with solar activity, that the observed changes are related in some way to 
changes in the internal magnetic fields. What we have really inverted for, therefore, is a 
change in the wave speed. If we assume that the entire change is due to a change in the wave 
pr opagation speed d ue the presence of magnetic fields, in other words that 6c^/c^ ~ "^l/c^; 



m 



Basu et al.l (120041 ). we can obtain a value for B. The change at the base of the convection 
zone is Sc^/c^ = (7.23 ±2.08) x 10~^, which implies a magnetic field strength of 29 kG. This 



is consistent with the results of earlier authors — iGoode fc Dziembowskil (I19931) placed an 
upper limit of IMG on the toroidal field at the base of the convection zone. andlBasul (119971) 
found that the magnetic field in this region could not exceed SOOkG. IChou &: Serebryanskiy 



(j2002|) found somewhat stronger fields (400 to 700kG). 



3.2. Latitudinal changes 

The MDI and GONG data sets also contain splitting coefficients. The even-order co- 
efficients contain information about the non-spherically symmetric structure in the solar 
interior. Because the surface manifestations of solar activity are strongly latitudinally de- 
pendent, we have used these coefficients to study the temporal variability of structure at 
different latitudes. The frequencies corresponding to different latitudes are computed using 
equation O The PGA procedure is performed for each latitude as was done with the mean 
frequencies, and as usual is done with respect to set #1216. The first eigenvector for six 
different latitudes is shown in Fig. [71 When plotted as a function of frequency, the latitudes 
from the equator to 30° show a similar frequency dependence as in the case of the mean 
frequencies in Fig. [1] When plotted as a function of r^, we see change at and below the 
convection zone base. The higher latitudes show no structure, and the eigenvectors for these 
latitudes are consistent with Gaussian noise. The scaling coefficients for each latitude as a 
function of time are shown in Fig. [HI along with the surface magnetic field. Like the scal- 
ing coefficients for the mean frequencies shown in Fig. [2l the latitudinal scaling coefficients 
closely follow the surface activity. 

We show the sound speed inversions for the equator, 15°, 30°, and 45° in Fig. [HI 
The errors in the eigenvectors are larger here than for the mean frequencies, in large part 
because each frequency is a combination of mean frequency and splitting coefficients, each 
with their own errors. Nevertheless, there are several points of interest in these inversions. 
The first is a clear sound speed change for radii greater than approximately r = O.86-R0 
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at 15° and the equator. The change seen in the SOLA inversion results is well matched in 
this region by the RLS inversion results, and the significance of the change approaches 2a. 
There is the possibility, though less statistically significant, of a change at greater depth, 
i.e., approximately r = 0.82i?Q. At 30°, a change in sound speed through the tachocline is 
seen in the RLS results, but it does not appear to be as clear in the SOLA results.. It is 
unclear whether or not this result is statistically significant. At latitudes of 45° and higher, 
the inversion errors become large, and the inversion results themselves become extremely 
sensitive to the inversion parameters. We conclude that there are no structural changes in 
the solar interior at or above a latitude of 45° large enough to be present in our data sets. 

We have also analyzed directly the change in solar structural asphericity over the course 
of the solar cycle, by taking the scaled frequency differences of high latitudes with respect 
to the equator. For this analysis, therefore, we have 54 mode sets at each latitude. The 
difference is equator— latitude. Figure [TO] shows the eigenvectors for five different latitudes 
with respect to the equator. Clearly, the signal to noise in these data are worse than either 
the latitudinal frequencies or the mean frequencies, but radial structure is discernible in the 
eigenvectors for the equator relative to 15°, 30°, and 45°. In fig. [H], the scaling coefficients for 
the asphericity terms are shown. There is an evident phase delay in the scaling coefficients 
of 15o and 30° with respect to the equator. This is expected from Fig. [HI The scaling 
coefficients for higher latitudes (45° and 60° with respect to the equator) show no structure 
except for an abrupt change from mostly positive to negative, corresponding closely to the 
peak of the solar activity cycle. 



4. CONCLUSIONS 

We have analyzed the changes in solar oscillation frequencies over the course of solar 
cycle 23. In order to reduce the effects of measurement errors and detect the faintest signa- 
tures of solar variability possible, we have employed a Principal Component Analysis of the 
frequencies. The mean frequencies are known to vary over the solar cycle, and this variation 
is known to be tightly correlated with surface activity. We have found this correlation as 
well. 

In addition to the frequency dependent change which these earher authors have detected, 
we have found a small but statistically significant chan ge in modes with turnin g point s at or 



below the convection zone. This confirms the result of [Chou &: SerebryanskiyI (|2005| ). This 
signature is present in both the MDI and the GONG data sets. We have inverted these 
results to obtain the difference in sound speed, and we have confirmed that there is a change 
in solar structure at the base of the convection zone over the course of the solar cycle. The 
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measured change at a radius of r = (O.712+°;°°^^)i?0 is bc^jc^ = (7.23 ± 2.08) x 10~^ where 
the errors in radius are a measure of the resolution of the inversion taken from the first and 
third quartile points of the inversion kernel. 

We have also used the sphtting coefficients to investigate how the changes in structure 
vary over latitude. We have found that the changes in the solar interior are tightly correlated 
with the latitudinal distribution of surface activity. The most statistically significant changes 
detected in the analysis are changes in sound speed in the outer ten percent (by radius) of 
the solar interior. 
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A. PRINCIPAL COMPONENT ANALYSIS 

We give a bri ef descrip t ion of the Principal Component Analysis technique that follows 
the one found in iKendalll (1980). This technique assumes that we begin with a set of 
observations, each one consisting of a set of data points (for example, a set of spectra of 
different objects, or a set of images, or, as in this paper, sets of mode frequency measurements 
at different points in time). Assume we have m observations, each with n data points. The 
m vectors Xi each contain the n data points Xij, measured relative to the mean Xj We wish 
to find a new set of vectors Cj that are linearly dependent on Xi, and uncorrelated with 
each other. In addition, we require that they have stationary values of their variance. This 
condition is imposed to ensure that most of the variance is accounted for in as few vectors 
Ci as possible. Alternatively, this condition can be viewed geometrically as a rotation of the 
basis vectors such that the variance of the data with respect to each basis vector is maximal. 
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This is most easily understood in the case of a two dimensional data set, where PCA is 
equivalent to a linear fit to the data, and the rotation sets one basis vector parallel to that 
fit. The vectors Cj will be given by a linear combination of the original observations: 

i=i 

The vectors will form a new basis set for the observations. The variance of Cj is given by 



var Cj = ^ ^ Cji^kiejk, (A2) 
j=i k=i 



where ejk is the covariance between Xj and x^- The covariance matrix is given by 

E = -XX^, (A3) 

n 



where X is the matrix {xi, X2, . . . , x^). In order to get unique solutions for the values of 
we impose a normalization condition: 



E4 = l- (A4) 



i=l 



Finding the stationary values of var Cj (eq. IA2I) with the condition I A4I is equivalent to finding 
the stationary values of: 

mm / m \ 

i=i k=l \j=i J 



where A is some constant. To find the stationary values, we differentiate IA5I by and find 
the roots of the equation: 

m 

E ^i'^'^^jk - A^ij = 0. (A6) 

k=l 

This is an eigenvalue problem — the vector is an eigenvector of E and A is the correspond- 
ing eigenvalue. The eigenvectors form an orthonormal basis set and are called the 'principal 
components' of the data set. When they are ordered by decreasing eigenvalue A, the first 
principal component will have the largest possible variance, the second the second largest 
variance, and so on. In our work, we use the singular value decomposition of the covariance 
matrix to get the eigenvalues and vectors. The vectors Cj are the scaling coefficients for the 
principal components. 
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The PCA technique has several known weaknesses. The first is that its simplest im- 
plementation requires a completely filled covariance matrix — in other words there can be 
no missing data in X. This is a problem for us since not all modes are identified in each 
different observation epoch. There are generalizations of the technique which allow for large 
quantities of missing data. In our case, however, the modes in which we are interested tend 
to be identified in most of the mode sets, so we can simply interpolate the missing modes 
from the existing frequency measurements without any significant effect on the results. We 
do not, therefore, use a more specialized technique. 

A second problem with PCA is its sensitivity to outliers. One often-quoted example 
shows a PCA decomposition where the corre lation coeffic ient between the first two compo- 
nents with one outlier removed is 0.99 



le.g. 



Huberlll98ll ). There are available routines for 



making PCA more robust, but in our case, because the dimensionality of the problem is 
relatively small, we can instead empirically test the sensitivity of the data set to outliers. 
Monte Carlo tests show that our data set is not prone to errors in the PCA due to outliers. 

Principal Compon ent Analysis has been used in a wide variety of astronomical con- 
texts (see references in iMurtagh fc Heckl 119871 . Chapter2). In solar physics, PCA has been 



used for, among other thi ngs, the inversion of Stokes profiles (e.g., lEydenberg et al 



f 



Rarairez Velez et al.l l2006l ) , in detecting structures in coronal activity (e.g . , ICadavid et al. 



20071 ). and in helioseismic rotation inversions (e.g.. lEff-Darwich et al.l 120041 ) 



2005 
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Fig. 1. — The first four eigenvectors for tlie MDI data set are sliown. Tlie base set is 
7^1216 (1996 May 1). The left-hand panels show the eigenvectors as a function of frequency, 
and the right-hand panels show it as a function of the lower turning point of the mode. The 
vertical line shows the position of the base of the convection zone. The vertical axis units are 
arbitrary (the vectors are normalized so that ■ = 1). The error bars show representative 
errors calculated using a Monte Carlo simulation. 
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Fig. 2. — The scaling coefficients for the ffist four eigenvectors are shown as points. They 
are shown as a function of time (the start date of the MDI mode sets) . The dotted hue is 
the change in 10.7cm radio flux with respect to the beginning of the solar cycle. The units 
are Solar Flux Units (SFU). 
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Fig. 3. — To demonstrate that the PCA representation of the data does indeed capture 
the actual data, we present two reconstructed data sets and compare them to the actual 
data. Two frequency difference sets are examined: sets #3160 (2001 August 27, panel a) 
and # 4528 (2005 May 26, panel b), both with respect to # 1216. At left, the frequency 
differences are plotted (panel (c) for set #3160 and panel (d) for set #4528). The actual 
data is shown in red, and the reconstructed data set using and ^2 ^^e overplotted in black. 
The normalized residuals are binned and shown against a Gaussian curve to show that the 
information which has been removed is statistically random. 
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much less obviously. The eigenvector ^2 shows no structure, unlike the equivalent eigenvector 
for MDI, implying that the MDI ^2 is an instrumental artifact. 
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Fig. 5. — Inversion for sound speed of the eigenvector. The top panel shows the inversion 
of the MDI data. The lower panel shows the inversion of the GONG data. The solid cyan 
line is the result from the RLS inversion (the dotted lines are the vertical error bounds). The 
red points are the results from the SOLA inversion. The horizontal dashed line is the zero- 
point. The vertical dashed line represents the location of the base of the convection zone. 
At the convection zone base, the MDI inversion results show a clear depression in sound 
speed at high activity (the sense of the inversion is low activity minus high activity) and an 
enhancement in the tachocline region. The depression is matched in the GONG inversion 
results. The location of this feature, though slightly deeper, is within the horizontal errors 
of the MDI result. 
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Fig. 6. — Change in inferred sound speed as function of activity level (10.7cm radio flux) is 
shown for different radii around the base of the convection zone. The shaded regions show 
the errors for each set of inversions. 
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Fig. 7. — The first eigenvector ^i(^) for latitudes from 0° to 75°. The data are from MDI and 
are relative to the #1216 mode set. The left-hand panels show the eigenvectors as functions 
of frequency, and the right-hand panels as functions of the lower turning points. 
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Fig. 8. — The scaling coefficients are plotted as a function of time and latitude. The top 
panel shows the coefficients for each individual eigenvector ^i{0). The bottom panel shows 
the scaling coefficients for all the latitudes as a function of the ^^(15°). This shows how 
the changes represented by that eigenvector change as a function of both time and latitude. 
The average unsigned magnetic flux from MDI carrington rotation synoptic maps over each 
72-day period is shown in contour. The contours are spaced every 52G, with the lowest at 
56. 5G. The vertical bars in 1998 are gaps in MDI coverage due to spacecraft problems. 
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Fig. 9. — The inversions of the latitudinal frequencies are shown for four different latitudes 
from the equator to 45°. The differences are with respect to MDI set #1216. As expected 
from an examination of the raw frequencies, there are no discernible features in the inver- 
sion for 45° and above — the inversions are extremely unstable to the choice of inversion 
parameters. At 15° and the equator, a significant (about 2cr) enhancement in sound speed 
at high activity is observed above approximately r = O.QSi?©. As before, the vertical dashed 
line indicates the position of the base of the convection zone (r = O.TlSi?©). 
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Fig. 10. — The ^-^ eigenvectors for the asphericity terms — equator minus latitude — are 
shown for five different latitudes. As usual, the left panels show the eigenvectors with respect 
to frequency, and the right hand panels are with respect to lower turning point. 
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Fig. 11. — The scaling coefficients for the asphericity eigenvectors are shown. The dotted 
hne is the -F10.7 flux. The 0° — 15° and 0° — 30° both show a phase shift, consistent with 
the butterfly diagram from Fig. [HI The higher latitudes show a distinct change from mostly 
positive to negative at high activity. 



